On the Complexity of t-Closeness Anonymization and Related Problems

Authors

  • Hongyu Liang
  • Hao Yuan
Abstract

An important issue in releasing individual data is to protect sensitive information from being leaked and maliciously utilized. Well-known privacy-preserving principles that aim to ensure both data privacy and data integrity, such as k-anonymity and l-diversity, have been extensively studied both theoretically and empirically. Nonetheless, these widely adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. The t-closeness principle was proposed to address this weakness, and it has the added benefit of supporting numerical sensitive attributes. However, in contrast to k-anonymity and l-diversity, the theoretical aspect of t-closeness has not been well investigated. We initiate the first systematic theoretical study of the t-closeness principle under the commonly used attribute suppression model. We prove that for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal t-closeness generalization of a given table. The proof consists of several reductions, each of which works for different values of t, and together they cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of k-anonymity and l-diversity left in the literature.
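As a rough illustration of what the t-closeness condition under attribute suppression asks for, the following minimal sketch checks whether a table satisfies it once a chosen set of quasi-identifier columns is suppressed. It is not the authors' algorithm. The table layout, the function names, and the use of variational distance (half the L1 distance) as the distance measure are all assumptions made here for illustration; the paper may use a different measure, such as the Earth Mover's Distance.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Empirical distribution of a list of sensitive values."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two distributions (assumed measure)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(table, suppressed, t):
    """Check t-closeness after suppressing the given quasi-identifier columns.

    table: list of (quasi_identifier_tuple, sensitive_value) pairs
           (hypothetical layout chosen for this sketch).
    suppressed: set of column indices removed from the quasi-identifiers.
    """
    overall = distribution([s for _, s in table])
    groups = defaultdict(list)
    for qi, s in table:
        # Rows that agree on every unsuppressed column fall into one group.
        key = tuple(v for i, v in enumerate(qi) if i not in suppressed)
        groups[key].append(s)
    return all(
        variational_distance(distribution(g), overall) <= t
        for g in groups.values()
    )


# Toy table: quasi-identifiers are (age, sex); the sensitive value is a diagnosis.
table = [
    (("30", "M"), "flu"),
    (("30", "F"), "cancer"),
    (("40", "M"), "flu"),
    (("40", "F"), "cancer"),
]

print(satisfies_t_closeness(table, {1}, 0.0))  # suppress sex -> True
print(satisfies_t_closeness(table, {0}, 0.4))  # suppress age -> False
```

In this toy table, suppressing the sex column leaves each age group with the same diagnosis distribution as the whole table, so even t = 0 is met. Suppressing age instead leaves groups that each contain a single diagnosis, at distance 0.5 from the overall distribution. The optimization problem studied in the paper, finding an optimal t-closeness generalization, is the part shown to be NP-hard; this sketch only verifies a given choice of suppressed columns.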


Similar resources

D2Pt: Privacy-Aware Multiparty Data Publication

Today, publication of medical data faces high legal barriers. On the one hand, publishing medical data is important for medical research. On the other hand, it is necessary to protect people's privacy by ensuring that the relationship between individuals and their related medical data remains unknown to third parties. Various data anonymization techniques remove as little identifying informati...


From t-closeness to differential privacy and vice versa in data anonymization

k-Anonymity and ε-differential privacy are two mainstream privacy models, the former introduced to anonymize data sets and the latter to limit the knowledge gain that results from the inclusion of one individual in the data set. Whereas basic k-anonymity only protects against identity disclosure, t-closeness was presented as an extension of k-anonymity that also protects against attribute discl...


From t-Closeness to PRAM and Noise Addition Via Information Theory

t-Closeness is a privacy model recently defined for data anonymization. A data set is said to satisfy t-closeness if, for each group of records sharing a combination of key attributes, the distance between the distribution of a confidential attribute in the group and the distribution of the attribute in the data is no more than a threshold t. We state here the t-closeness property in terms of i...


Anonymization Based Location Privacy Preservation in Vehicular Ad Hoc Networks

Vehicular ad hoc networks (VANETs) are used for many safety-related applications such as accident avoidance, traffic control, and emergency warning. All these applications require vehicles to broadcast their position, velocity, direction, and identity at fixed intervals. It becomes necessary to preserve the location privacy of the vehicle since global attackers can collect these va...


Connecting Randomized Response, Post-Randomization, Differential Privacy and t-Closeness via Deniability and Permutation

We explore some novel connections between the main privacy models in use and we recall a few known ones. We show these models to be more related than commonly understood, around two main principles: deniability and permutation. In particular, randomized response turns out to be very modern in spite of it having been introduced over 50 years ago: it is a local anonymization method and it allows ...




Publication date: 2013